Smart Paper Notes

Multi-View Reconstruction using Signed Ray Distance Functions (SRDF)

Pierre Zins , Yuanlu Xu , Edmond Boyer , Stefanie Wuhrer , Tony Tung

Category: Computer Vision

2022-08-31

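In this paper, we address the problem of multi-view 3D shape reconstruction. While recent differentiable rendering approaches associated with implicit shape representations have provided breakthrough performance, they remain computationally heavy and often lack precision in the estimated geometry. To overcome these limitations, we investigate a new computational approach that builds on a novel shape representation that is volumetric, as in recent differentiable rendering approaches, but parameterized with depth maps to better materialize the shape surface. The shape energy associated with this representation evaluates 3D geometry given color images and does not need appearance prediction, yet still benefits from volumetric integration when optimized. In practice, we propose an implicit shape representation, the SRDF, based on signed distances that we parameterize by depths along camera rays. The associated shape energy considers the agreement between depth prediction consistency and photometric consistency at 3D locations within the volumetric representation. Various photo-consistency priors can be accounted for, from a basic baseline to a more elaborate criterion such as a learned function. The approach retains the pixel accuracy of depth maps and is parallelizable. Our experiments on standard datasets show that it provides state-of-the-art results with respect to recent approaches with implicit shape representations as well as traditional multi-view stereo methods.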
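To make the idea of a signed distance parameterized by depth along a camera ray concrete, here is a minimal sketch. It is not the authors' implementation: the function name, the argument layout, and the sign convention (positive in front of the surface, negative behind it) are assumptions made for illustration only.

```python
import math


def signed_ray_distance(point, cam_center, surface_depth):
    """Signed distance from `point` to the surface, measured along the camera ray.

    Assumptions (hypothetical, for illustration):
      - `surface_depth` is the depth-map value for the pixel that `point`
        projects to, expressed as a distance from the camera center.
      - Positive result: the point lies between the camera and the surface.
        Negative result: the point lies behind the surface.
    """
    point_depth = math.dist(point, cam_center)  # distance camera -> point
    return surface_depth - point_depth


# One camera at the origin; the surface is seen 3 units away along this ray.
cam = (0.0, 0.0, 0.0)
print(signed_ray_distance((0.0, 0.0, 2.0), cam, 3.0))  # in front of the surface
print(signed_ray_distance((0.0, 0.0, 4.0), cam, 3.0))  # behind the surface
```

Under this convention, a 3D point on the true surface would have a value of zero with respect to every camera that sees it, which is the kind of cross-view agreement a depth-consistency term can measure.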

translated by Google Translate

Spatio-temporal motion completion using a sequence of latent primitives

Mathieu Marsot , Stefanie Wuhrer , Jean-Sebastien Franco , Anne Hélène Olivier

Category: Computer Vision

2022-06-27

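We propose a markerless performance capture method that computes a temporally coherent 4D representation of an actor deforming over time from a sparsely sampled sequence of untracked 3D point clouds. Our method proceeds by latent optimization with a spatio-temporal motion prior. Recently, task-generic motion priors have been introduced that propose a coherent representation of human motion based on a single latent code, with encouraging results on short sequences with given temporal correspondences. Extending these methods to longer sequences without correspondences is far from straightforward. A single latent code proves inefficient for encoding longer-term variability, and latent space optimization is very susceptible to erroneous local minima caused by possible inverted pose fittings. We address both problems by learning a motion prior that encodes a 4D human motion sequence into a sequence of latent primitives instead of one latent code. We also propose an additional mapping encoder that directly projects point clouds into the learned latent space to provide a good initialization of the latent representation at inference time. Our temporal decoding from the latent space is implicit and continuous in time, providing flexibility in temporal resolution. We show experimentally that our method outperforms state-of-the-art motion priors.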

Neural Human Deformation Transfer

Jean Basset , Adnane Boukhayma , Stefanie Wuhrer , Franck Multon , Edmond Boyer

Category: Computer Vision

2021-09-03

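We consider the problem of human deformation transfer, where the goal is to retarget poses between different characters. Traditional methods that tackle this problem require a clear definition of the pose and use this definition to transfer poses between characters. In this work, we take a different approach and transform the identity of a character into a new identity without modifying the character's pose. This offers the advantage of not having to define equivalences between 3D human poses, which is not straightforward because poses tend to change depending on the identity of the character performing them and because their meaning is highly contextual. To achieve the deformation transfer, we propose a neural encoder-decoder architecture in which only identity information is encoded and the decoder is conditioned on the pose. We use pose-independent representations, such as isometry-invariant shape characteristics, to represent identity features. Our model uses these features to supervise the prediction of offsets from the deformed pose to the result of the transfer. We show experimentally that our method outperforms state-of-the-art methods both quantitatively and qualitatively, and generalizes better to poses not seen during training. We also introduce a fine-tuning step that yields competitive results for extreme identities and allows simple clothing to be transferred.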

A structured latent space for human body motion generation

Mathieu Marsot , Stefanie Wuhrer , Jean-Sebastien Franco , Stephane Durocher

Category: Computer Vision

2021-06-07

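We propose a framework to learn a structured latent space to represent 4D human body motion, where each latent vector encodes a full motion of the whole 3D human shape. On the one hand, several data-driven skeletal animation models exist that propose motion spaces of temporally dense motion signals, but they are based on geometrically sparse kinematic representations. On the other hand, many methods exist to build shape spaces of dense 3D geometry, but only for static frames. We bring both concepts together, proposing a motion space that is dense both temporally and geometrically. Once trained, our model generates a multi-frame sequence from a single point in a low-dimensional latent space. This latent space is built to be structured, such that similar motions form clusters. It also embeds variations of duration in the latent vector, allowing semantically close sequences that differ only in their temporal unfolding to share similar latent vectors. We demonstrate experimentally the structural properties of our latent space and show that it can be used to generate plausible interpolations between different actions. We also apply our model to 4D human motion completion, showing its promising ability to learn spatio-temporal features of human motion.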

Data-Driven 3D Reconstruction of Dressed Humans From Sparse Views

Pierre Zins , Yuanlu Xu , Edmond Boyer , Stefanie Wuhrer , Tony Tung

Category: Computer Vision

2021-04-16

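Recently, data-driven single-view reconstruction methods have shown great progress in modeling 3D dressed humans. However, such methods suffer heavily from the depth ambiguities and occlusions inherent to single-view inputs. In this paper, we tackle this problem by considering a small set of input views and investigating the best strategy to suitably exploit information from these views. We propose a data-driven end-to-end approach that reconstructs an implicit 3D representation of dressed humans from sparse camera views. Specifically, we introduce three key components: first, a spatially consistent reconstruction that uses a perspective camera model and allows arbitrary placement of the person in the input views; second, an attention-based fusion layer that aggregates visual information from several viewpoints; and third, a mechanism that encodes local 3D patterns under the multi-view context. In the experiments, we show that the proposed approach outperforms the state of the art on standard data both quantitatively and qualitatively. To demonstrate the spatially consistent reconstruction, we apply our approach to dynamic scenes. Additionally, we apply our method to real data acquired with a multi-camera platform and demonstrate that it can obtain results comparable to multi-view stereo with dramatically fewer views.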

Optimal algorithms for group distributionally robust optimization and beyond

Tasuku Soma , Khashayar Gatmiry , Stefanie Jegelka

Category: Machine Learning

2022-12-28

Distributionally robust optimization (DRO) can improve the robustness and fairness of learning methods. In this paper, we devise stochastic algorithms for a class of DRO problems including group DRO, subpopulation fairness, and empirical conditional value at risk (CVaR) optimization. Our new algorithms achieve faster convergence rates than existing algorithms for multiple DRO settings. We also provide a new information-theoretic lower bound that implies our bounds are tight for group DRO. Empirically, too, our algorithms outperform known methods.
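For readers unfamiliar with the objectives named above, the following sketch evaluates two of them on a list of per-sample losses. It only computes the objective values and is not the paper's optimization algorithm. The function names are hypothetical, and the CVaR convention used here (average of the worst `alpha` fraction of losses) is an assumption, since some texts parameterize the level differently.

```python
import math


def empirical_cvar(losses, alpha):
    """Average of the worst ceil(alpha * n) losses, for 0 < alpha <= 1.

    Assumed convention: alpha is the fraction of worst-case samples kept.
    """
    n = len(losses)
    # Small tolerance guards against floating-point overshoot in alpha * n.
    k = max(1, math.ceil(alpha * n - 1e-12))
    worst = sorted(losses, reverse=True)[:k]
    return sum(worst) / k


def worst_group_loss(losses, groups):
    """Group DRO objective value: the largest per-group average loss."""
    totals, counts = {}, {}
    for loss, group in zip(losses, groups):
        totals[group] = totals.get(group, 0.0) + loss
        counts[group] = counts.get(group, 0) + 1
    return max(totals[g] / counts[g] for g in totals)


losses = [1.0, 2.0, 3.0, 4.0]
print(empirical_cvar(losses, 0.5))                     # mean of the two largest losses
print(worst_group_loss(losses, ["a", "a", "b", "b"]))  # group "b" has the higher mean
```

Both quantities focus on the hardest part of the data rather than the average case, which is why minimizing them tends to favor robustness and fairness over mean performance.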

Investigation of reinforcement learning for shape optimization of profile extrusion dies

Clemens Fricke , Daniel Wolff , Marco Kemmerling , Stefanie Elgeti

Category: Machine Learning

2022-12-23

Profile extrusion is a continuous production process for manufacturing plastic profiles from molten polymer. Especially interesting is the design of the die, through which the melt is pressed to attain the desired shape. However, due to an inhomogeneous velocity distribution at the die exit or residual stresses inside the extrudate, the final shape of the manufactured part often deviates from the desired one. To avoid these deviations, the shape of the die can be computationally optimized, which has already been investigated in the literature using classical optimization approaches. A new approach in the field of shape optimization is the utilization of Reinforcement Learning (RL) as a learning-based optimization algorithm. RL is based on trial-and-error interactions of an agent with an environment. For each action, the agent is rewarded and informed about the subsequent state of the environment. While not necessarily superior to classical, e.g., gradient-based or evolutionary, optimization algorithms for one single problem, RL techniques are expected to perform especially well when similar optimization tasks are repeated since the agent learns a more general strategy for generating optimal shapes instead of concentrating on just one single problem. In this work, we investigate this approach by applying it to two 2D test cases. The flow-channel geometry can be modified by the RL agent using so-called Free-Form Deformation, a method where the computational mesh is embedded into a transformation spline, which is then manipulated based on the control-point positions. In particular, we investigate the impact of utilizing different agents on the training progress and the potential of wall time saving by utilizing multiple environments during training.
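The Free-Form Deformation idea mentioned above can be sketched in a few lines. The abstract does not say which spline is used, so this example swaps in the classical Bernstein (Bezier) formulation on the unit square as an assumption. The function names and the lattice layout are hypothetical.

```python
from math import comb


def bernstein(n, i, t):
    """Bernstein basis polynomial B_{i,n}(t)."""
    return comb(n, i) * t**i * (1 - t) ** (n - i)


def regular_lattice(n, m):
    """Undeformed (n+1) x (m+1) control lattice covering the unit square."""
    return [[(i / n, j / m) for j in range(m + 1)] for i in range(n + 1)]


def ffd_2d(point, control):
    """Map a point (s, t) in the unit square through the control lattice.

    control[i][j] is the (x, y) position of control point (i, j).
    With an undeformed regular lattice the mapping is the identity.
    """
    s, t = point
    n = len(control) - 1
    m = len(control[0]) - 1
    x = y = 0.0
    for i in range(n + 1):
        for j in range(m + 1):
            w = bernstein(n, i, s) * bernstein(m, j, t)
            x += w * control[i][j][0]
            y += w * control[i][j][1]
    return (x, y)


# An agent's action could be a displacement of one control point.
control = regular_lattice(2, 2)
control[2][2] = (1.5, 1.5)            # pull the top-right corner outward
print(ffd_2d((1.0, 1.0), control))    # the corner follows its control point
print(ffd_2d((0.0, 0.0), control))    # the opposite corner stays in place
```

Because every mesh node is moved by a smooth blend of a handful of control points, the action space stays small even when the computational mesh is fine, which is what makes this parameterization convenient for a learning agent.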

Objective Surgical Skills Assessment and Tool Localization: Results from the MICCAI 2021 SimSurgSkill Challenge

Aneeq Zia , Kiran Bhattacharyya , Xi Liu , Ziheng Wang , Max Berniker , Satoshi Kondo , Emanuele Colleoni , Dimitris Psychogyios , Yueming Jin , Jinfan Zhou

Category: Computer Vision

2022-12-08

Timely and effective feedback within surgical training plays a critical role in developing the skills required to perform safe and efficient surgery. Feedback from expert surgeons, while especially valuable in this regard, is challenging to acquire due to their typically busy schedules, and may be subject to biases. Formal assessment procedures like OSATS and GEARS attempt to provide objective measures of skill, but remain time-consuming. With advances in machine learning there is an opportunity for fast and objective automated feedback on technical skills. The SimSurgSkill 2021 challenge (hosted as a sub-challenge of EndoVis at MICCAI 2021) aimed to promote and foster work in this endeavor. Using virtual reality (VR) surgical tasks, competitors were tasked with localizing instruments and predicting surgical skill. Here we summarize the winning approaches and how they performed. Using this publicly available dataset and results as a springboard, future work may enable more efficient training of surgeons with advances in surgical data science. The dataset can be accessed from https://console.cloud.google.com/storage/browser/isi-simsurgskill-2021.

On the generalization of learning algorithms that do not converge

Nisha Chandramoorthy , Andreas Loukas , Khashayar Gatmiry , Stefanie Jegelka

Category: Machine Learning | Machine Learning (Statistics)

2022-08-16

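Generalization analyses of deep learning typically assume that training converges to a fixed point. However, recent results indicate that in practice the weights of deep neural networks optimized with stochastic gradient descent often oscillate indefinitely. To reduce this discrepancy between theory and practice, this paper focuses on the generalization of neural networks whose training dynamics do not necessarily converge to fixed points. Our main contribution is to propose a notion of statistical algorithmic stability (SAS) that extends classical algorithmic stability to non-convergent algorithms, and to study its connection to generalization. This ergodic-theoretic approach leads to new insights when compared to the traditional optimization and learning theory perspectives. We prove that the stability of the time-asymptotic behavior of a learning algorithm relates to its generalization, and we demonstrate empirically how loss dynamics can provide clues to generalization performance. Our findings provide evidence that networks that "train stably generalize better" even when training continues indefinitely and the weights do not converge.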

RLang: A Declarative Language for Expressing Prior Knowledge for Reinforcement Learning

Rafael Rodriguez-Sanchez , Benjamin Spiegel , Jennifer Wang , Roma Patel , Stefanie Tellex , George Konidaris

Category: Artificial Intelligence | Machine Learning

2022-08-12

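Communicating useful background knowledge to reinforcement learning (RL) agents is an important method for accelerating learning. We introduce RLang, a domain-specific language (DSL) for communicating domain knowledge to an RL agent. Unlike other existing DSLs proposed by the RL community, which ground to single elements of a decision-making formalism (e.g., the reward function or policy function), RLang can specify information about every element of a Markov decision process. We define precise syntax and grounding semantics for RLang, and provide a parser implementation that grounds RLang programs to an algorithm-agnostic partial world model and policy that can be exploited by an RL agent. We provide a series of example RLang programs and demonstrate how different RL methods can exploit the resulting knowledge, including model-free and model-based tabular algorithms, hierarchical approaches, and deep RL algorithms (including both policy gradient and value-based methods).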
